(54) Locked exchange FIFO



(57) A FIFO with locked exchange capability is disclosed.
The FIFO has a memory for storing and retrieving
data submissions, a write address generator and a
read address generator for sequentially addressing the
memory. A difference counter maintains the difference
between the number of writes to the queue and reads
from the queue. The net difference, as tracked by the
counter, is a measure of the FIFO utilization. To detect
the queue full condition, a comparator compares the
maximum FIFO stack depth against the counter output.
The result of this comparison is latched and provided to
a write strobe generator so that, in a subsequent write
operation, if the FIFO is full, the write strobe from the
producer is blocked and the data will not be written to
the FIFO. Otherwise, the write strobe from the producer
is passed to the memory. Additionally, a remaining
space count is maintained in a status register. During
operation, a bus master performing the read-modify-write
cycle to the FIFO reads the status register to find
the available space in the FIFO and immediately writes
the data to the FIFO. If the read returns a zero, indicating
that the FIFO is full, the bus master requeues the data
for another read-modify-write cycle as it knows that the
data has not been stored in the FIFO.
Description

The present invention relates in general to a data 
buffer, and more particularly, to a first-in-first-out (FIFO) 
buffer with a locked exchange capability.

Although every new generation of microprocessor 
has delivered an impressive leap in performance over 
the previous generation, more processing power is still
needed by many applications. To meet this insatiable 
need for greater processing capability, computer archi- 
tects are applying a number of techniques, including 
multiprocessing and parallel processing, which essentially
deploy a number of processors to process one or
more tasks simultaneously. An increase in processors 
ideally results in a corresponding increase in computer 
power, assuming the tasks can be allocated to minimize 
interprocessor communication and coordination costs. 
In addition, computer architects are distributing intelli- 
gence at the input/output level by endowing computer 
peripherals with one or more microprocessors. The use
of intelligent peripherals conserves host processor re- 
sources since the local microprocessors perform spe- 
cific functions that the host processors would otherwise 
be required to perform. 

A peripheral with a dedicated processor is dis- 
cussed in U.S. Patent No. 5,101,492, entitled DATA RE- 
DUNDANCY AND RECOVERY PROTECTION, issued 
to Schultz, et al., and assigned to the assignee of the 
present invention. Schultz discloses a personal computer
having a fault tolerant, intelligent disk array controller
system capable of managing the operation of an array of
up to eight standard integrated disk drives without
supervision by the computer host. Communication ports
are provided for submitting a command list and for no- 
tifying the host of the completion of requested jobs. 
Through these ports, a host processor can transmit one 
or more high level commands to the disk system and 
retrieve the results from the local processor overseeing
the disk sub-system after the local processor has col- 
lected the data from the disk drives. The local micro- 
processor, on receiving this request, builds a data 
processing structure and oversees the execution of the 
command list. Once the execution of the command list 
is finished, the local processor notifies the operating 
system device driver to indicate to the requesting bus 
master that its request has been performed. The local 
processor in Schultz thus off-loads the disk manage- 
ment function from the host processor. 

In a system with multiple processors or bus mas- 
ters, provisions for allocating resources as well as re- 
sponsibilities among various processors are needed. 
Further, the synchronization mechanism has to guarantee
that the bus masters do not modify the resource at
the same time. In other words, a mutual exclusion be- 
tween system resources such as the ports needs to be 
guaranteed under certain circumstances. Techniques 
that improve multi-processor communication efficiency 
are of great importance because they allow lower cost
microprocessors and components to perform work that
previously required the use of more expensive main- 
frames and minicomputers. Increased multiprocessing 
efficiency, therefore, leads directly to computer system 
designs that have lower cost, improved performance, or 
both. 

Prior art solutions to the communication/synchronization
problem in a multiprocessing system typically
utilize semaphores and work queues. A semaphore is a 
special flag corresponding to an individual resource to 
control accessing rights in order to prevent mutual inter- 
ference. Traditionally, a register or a memory location is 
used as a semaphore flag. In using the semaphore, a 
bus master reads the semaphore flag. If the flag is clear, 
the bus master sets the semaphore flag to lock the re- 
source and then accesses the resource. Once the bus 
master is done with the resource, it clears the sema- 
phore flag so that other processors or tasks can have 
access to the resource. To ensure an orderly manner of 
setting and clearing the semaphore, the semaphore is 
accessed and changed in an indivisible operation, also 
known as a test and set (TAS) or exchange operation. 
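
The indivisible test and set operation described above can be sketched in software as follows. This is a minimal illustration only: the class name, the method names, and the use of a thread lock to stand in for the hardware bus lock are assumptions, not part of the disclosure.

```python
import threading


class SemaphoreFlag:
    """Illustrative model of a semaphore flag with an indivisible test and set."""

    def __init__(self):
        self._flag = 0
        # Assumption: a thread lock stands in for the hardware bus lock.
        self._bus_lock = threading.Lock()

    def test_and_set(self):
        """Return the old flag value and set the flag in one indivisible step."""
        with self._bus_lock:
            old = self._flag
            self._flag = 1
            return old

    def clear(self):
        """Release the resource so that other bus masters may claim it."""
        with self._bus_lock:
            self._flag = 0


sem = SemaphoreFlag()
first = sem.test_and_set()   # flag was clear: caller now owns the resource
second = sem.test_and_set()  # flag already set: caller must retry later
sem.clear()
third = sem.test_and_set()   # flag was cleared by the owner: claim succeeds
```

A returned value of zero means the caller found the flag clear and has locked the resource; a returned value of one means another bus master already holds it.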

Similar in concept to the semaphore, the work 
queue resides at a predefined address and provides a 
convenient place for the bus masters to drop off their 
requests, which may be high level commands or requests
to the resource. Typically, the work queue is organized
as a first-in-first-out (FIFO) queue so that each
processor's requests can be processed in the order of 
submission, although other sequencing arrangements
are also known in the art. To place a request in the work 
queue, the requesting processor queries a work queue 
pointer to determine whether or not the queue has suf- 
ficient space to accept another request. If the queue is 
full, the bus master waits a period of time and rechecks 
the queue. Once the queue has space available, the bus 
master submits the request. Once the requested job has
been completed, the result is communicated to the re- 
questing processor in a number of ways, including in- 
terrupting the requesting bus master with a pointer to 
the results generated. Alternatively, the pointer to the 
results may be placed in a status queue for the proces- 
sors to interrogate and determine the status of the re- 
quest. However, this need to first query for space avail- 
ability and then write the actual data takes time and de- 
lays entry of the data or job into the work queue. It is 
desirable to increase the efficiency of this operation. Ad- 
ditionally, if multiple bus masters are addressing a single 
work queue, semaphore operations must be provided 
to control access to the queue, thus even further increasing
the overhead to provide data, as now the semaphore
must be checked before the work queue status
can be checked. It would be further desirable to avoid 
the need for this semaphore operation when multiple 
bus masters are present. 

A FIFO with locked exchange capability is provided 
with a memory for storing and retrieving data submissions.
Command pointer data is written to the FIFO command
pointer port by asserting a write strobe. Command
pointer data is transferred or read from the FIFO when- 
ever a previous command has been completed and the 
command pointer data can be written to a command 
pointer. For each FIFO access, a difference counter 
maintains the difference between the number of writes 
to the queue and reads from the queue by decrementing
the difference output on each data read and increment- 
ing the difference output on each assertion of the write 
strobe. The net result from the counter is a measure of 
the fullness of the FIFO. A remaining space value for 
the FIFO is computed by subtracting the difference out- 
put from the maximum FIFO stack depth. The remaining 
FIFO space is provided as the data obtained in response 
to a read operation of the command pointer port. This 
difference of operation of the command pointer port al- 
lows use of a read/modify/write or exchange operation 
to the port. In the exchange operation, the bus master 
first receives the remaining FIFO space value and then 
writes the command pointer data in a locked operation.
This locked operation prevents another bus master from 
intervening. Therefore, a semaphore operation is not
necessary. Further, if the bus master performs the exchange
operation and receives a zero space remaining
indication, the bus master can assume that the com- 
mand pointer data was not accepted, as the FIFO was 
already full. 
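
The behavior summarized above can be modeled in a few lines of software. The sketch below is a behavioral illustration under stated assumptions, not the disclosed circuit: the class and method names are hypothetical, and a single method call stands in for the locked read and write bus cycles.

```python
from collections import deque


class LockedExchangeFifo:
    """Behavioral model only; names are illustrative, not from the disclosure."""

    def __init__(self, depth=16):
        self.depth = depth       # maximum FIFO stack depth
        self.memory = deque()    # stored command pointers
        self.difference = 0      # number of writes minus number of reads

    def remaining_space(self):
        # Remaining space = maximum stack depth - difference output.
        return self.depth - self.difference

    def exchange(self, command_pointer):
        """Read the remaining space, then write, as one locked operation."""
        space = self.remaining_space()   # read half of the exchange
        if space > 0:                    # write half, blocked when full
            self.memory.append(command_pointer)
            self.difference += 1
        return space

    def pop(self):
        """Consumer read: remove and return the oldest command pointer."""
        self.difference -= 1
        return self.memory.popleft()


fifo = LockedExchangeFifo(depth=2)
results = [fifo.exchange(p) for p in (0x1000, 0x2000, 0x3000)]
```

With a depth of two, the third exchange returns zero and its command pointer is not stored, which is exactly what the bus master infers from the zero value.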

However, because the FIFO can be read at any 
time, it is possible that the FIFO is full at the time the 
FIFO answers the requesting bus master's read of the 
command pointer port, but immediately after answering 
the read, space in the FIFO becomes available due to 
an intervening operation whereby a data item is re- 
moved, or popped, from the FIFO stack before the write 
portion of the exchange operation. In this event, the
FIFO would store the write operation into the recently
freed-up space, even though the FIFO had previously
indicated to the requesting processor that it was full. 
Eventually, the bus master would erroneously resubmit 
its request not knowing that the previous exchange cy- 
cle had in fact already stored the command pointer data 
in the FIFO. This would erroneously result in the com- 
mand being performed twice. 

To remedy this potentially erroneous condition, the 
FIFO full output indication is latched and provided to a 
write strobe generator so that, in the subsequent write 
operation, if the FIFO was indicated to be full, the write 
strobe is blocked and the data will not be written to the 
memory of the FIFO. Therefore, the bus master's as- 
sumption will remain correct. The locked nature of the 
exchange operation ensures that no other bus master 
will be able to perform a write before the bus master 
which read the full status performs its command pointer 
data write. 
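
The race described above, and the effect of latching the full indication, can be illustrated with a small simulation. This is a sketch under stated assumptions: the class name, the `latch_full` switch, and the one-entry depth are hypothetical choices made to keep the example short.

```python
class CommandPointerPort:
    """Illustrative port model with an optional latched full indication."""

    def __init__(self, depth, latch_full):
        self.depth = depth
        self.stored = []
        self.latch_full = latch_full
        self.full_latched = False

    def read_status(self):
        """Read half of the exchange: return remaining space, latch full."""
        remaining = self.depth - len(self.stored)
        self.full_latched = (remaining == 0)
        return remaining

    def consumer_pop(self):
        """Intervening consumer read that frees one location."""
        if self.stored:
            self.stored.pop(0)

    def write(self, pointer):
        """Write half of the exchange; returns True if the data was stored."""
        if self.latch_full:
            blocked = self.full_latched               # use the latched status
        else:
            blocked = len(self.stored) == self.depth  # use the live status
        if not blocked:
            self.stored.append(pointer)
        return not blocked


def run(latch_full):
    port = CommandPointerPort(depth=1, latch_full=latch_full)
    port.read_status()
    port.write("A")               # fill the single location
    status = port.read_status()   # bus master is told the FIFO is full
    port.consumer_pop()           # space frees up before the write half
    port.write("B")
    return status, list(port.stored)


without_latch = run(latch_full=False)
with_latch = run(latch_full=True)
```

Without the latch, the bus master reads zero yet its data is stored, so a later resubmission would duplicate the command. With the latch, the zero status and the rejected write agree.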

During operation, a bus master reads the status of 
the FIFO to find the available space in the FIFO and then 
immediately writes the data to the FIFO. If the result of 
the status read equals zero, indicating that the FIFO was
full, the bus master requeues the data since the data
has not been accepted by the FIFO. Alternatively, if the 
result of the status read is greater than zero, the bus 
master knows that its submission has been accepted.

By ensuring that the FIFO does not accept the write
operation from the requesting processor even if the
FIFO space was available subsequent to the status read
of the FIFO, a locked exchange FIFO is provided for a
reliable submission of data, with the exchange operation
preventing interruption by another bus master. Thus, the
need to perform a semaphore operation is removed as 
is the need to query for space and then write the data. 
A simple exchange operation is used, thus increasing 
efficiency, as desired. Other objects, features, and ad- 
vantages of the present invention will be apparent from 
the accompanying drawings and from the detailed de- 
scription that follows below. 
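
From the bus master's point of view, the submission procedure summarized above reduces to a single exchange followed by a requeue decision. The following is a minimal sketch; `make_port` is a hypothetical stand-in for the command pointer port, and `submit_all` is an illustrative helper, neither taken from the disclosure.

```python
from collections import deque


def make_port(depth):
    """Hypothetical stand-in for the locked exchange command pointer port."""
    stored = deque()

    def exchange(pointer):
        space = depth - len(stored)   # value returned by the read half
        if space > 0:                 # write half is accepted only if not full
            stored.append(pointer)
        return space

    return stored, exchange


def submit_all(pending, exchange):
    """Try each pending pointer once; return those that must be requeued."""
    requeued = deque()
    while pending:
        pointer = pending.popleft()
        if exchange(pointer) == 0:    # zero means the FIFO was full
            requeued.append(pointer)  # data was not stored, so try again later
    return requeued


stored, exchange = make_port(depth=2)
requeued = submit_all(deque([0x100, 0x200, 0x300]), exchange)
```

No semaphore and no separate space query appear in the loop; the value returned by the exchange is the only information the bus master needs.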

A better understanding of the present invention can 
be obtained when the following detailed description of 
the preferred embodiment is considered in conjunction 
with the following drawings, in which: 

Figure 1 is a block diagram of a disk array system 
containing the locked exchange FIFO of the present 
invention; 

Figure 2 is a block diagram of the DRAM interface 
of Figure 1; 

Figure 3 is a block diagram of the command pointer
FIFO of Figure 2;

Figure 4 is a block diagram of the write address generator
of Figure 3;

Figure 5 is a block diagram of the read address generator
of Figure 3;

Figure 6 is a block diagram of the difference counter
of Figure 3;

Figure 7 is a block diagram of the FIFO empty, FIFO
full, and the command pointer FIFO status register;

Figure 8 is a block diagram of the lock exchange
circuit for the command pointer FIFO of Figure 3;
and

Figure 9 is a flowchart of a procedure to access the 
command pointer FIFO register. 

Turning to the drawings, Figure 1 discloses a block 
diagram of a computer system S having an intelligent 
disk array system 101 containing a FIFO with locked exchange
capability. For purposes of illustration only, and
not to limit generality, the invention will be described with
reference to its operation within a disk array system. 

The computer system S has a plurality of host proc- 
essors 90 and 92. These host processors are connected 
to a host bus 94. The host bus 94 is a relatively high 
speed bus in comparison with a peripheral bus 100, 
preferably an EISA bus, which is provided to interface 
the system S with a plurality of peripherals. A memory 
array 98 is positioned between the host bus 94 and the 
EISA bus 100. Additionally, a host bus to EISA bus 
bridge 96 is placed between the two buses to transfer 
data from one bus to the other. The EISA bus has one
or more slots 103 to which the disk array system is
connected. Although the bus 100 is illustrated as being
an EISA bus, it may alternatively be a PCI bus or
any other suitable bus.

During the operation of the computer system, the 
bus master issues I/O requests, such as disk read and 
write requests, to the intelligent disk array system 101 
to request that data be transferred over the EISA bus 
100. The EISA bus 100 is connected to an EISA bridge 
104, which is connected to the disk array system via a 
PCI local bus 102. The dual bus hierarchy of Figure 1 
allows for concurrent operations on both buses. The EISA
bridge 104 also performs data buffering which permits
concurrency for operations that cross over from one
bus into another bus. For example, an EISA device
could post data into the bridge 104, permitting the PCI 
local bus transaction to complete independently, freeing 
the EISA bus 100 for further transactions. 

The PCI local bus 102 is further connected to a 
processor to PCI bridge 110. The other side of the processor
to PCI bridge 110 is connected to a local processor
106 which oversees the operation of the intelligent
disk array system 101, including the caching of the disk
data, among others. The processor to PCI bridge 110
interfaces the local processor 106 to the local PCI bus
102 to provide host access to the local processor support
functions and to enable the local processor to access
resources on the PCI bus 102. The bridge 110 performs
a number of functions, including big endian to little
endian format conversions, interrupt controls, local 
processor DRAM interfacing, and decoding for the local 
processor ports, among others. 

The PCI local bus 102 is also connected to a DRAM
interface 118, which in turn is connected to a DRAM 
memory array 116. The DRAM interface 118 and the 
DRAM memory array 116 can support either a 32 or a 
64-bit data path with a parity protected interface and/or 
an 8-bit error detection and correction of the DRAM ar- 
ray data. The DRAM array 116 provides a buffer which 
can serve, among others, as a disk caching memory to 
increase the system throughput. In addition to support- 
ing the DRAM array 116, the DRAM interface 118 sup- 
ports three hardware commands essential for drive ar- 
ray operations: memory to memory move operation, ze- 
ro fill operation and zero detect operation. The memory 
to memory move operation moves data from system 
memory 98 to a write cache located in the DRAM array 
116 during write posting operations. Also, on cache hits 
to previously posted data still residing in cache, a bus 
master in the DRAM interface 118 is programmed to 
move the data in the write cache to the system memory 
98. Further, the movement of data located within the
DRAM array 116 is supported by the DRAM interface 
118. The second hardware command supporting drive 
array operations is a zero-fill command, which is used 
to initialize the XOR buffer for RAID 4 and 5 operations. 
Finally, the DRAM interface bus master supports a zero
detect operation, which is used in RAID 1, RAID 4, and
RAID 5 operations to check redundant disk data integrity.
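
One plausible way the zero fill and zero detect commands fit together is sketched below. The description does not spell out the checking procedure, so this is an assumption based on conventional XOR parity: the data blocks and the parity block are XORed into a zero-filled buffer, and an all-zero result indicates consistent redundancy. All function names are hypothetical.

```python
def zero_fill(length):
    """Model of the zero fill command: an XOR buffer initialized to zeros."""
    return bytearray(length)


def xor_into(buffer, block):
    """Accumulate one block into the XOR buffer, byte by byte."""
    for i, byte in enumerate(block):
        buffer[i] ^= byte


def zero_detect(buffer):
    """Model of the zero detect command: True when every byte is zero."""
    return not any(buffer)


data_blocks = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])]

# Generate parity: XOR all data blocks into a zero-filled buffer.
parity = zero_fill(4)
for block in data_blocks:
    xor_into(parity, block)

# Check redundancy: XOR data and parity together and test for all zeros.
check = zero_fill(4)
for block in data_blocks + [bytes(parity)]:
    xor_into(check, block)
consistent = zero_detect(check)

# A single corrupted byte leaves a non-zero residue in the buffer.
corrupt = zero_fill(4)
for block in [bytes([1, 2, 3, 5])] + data_blocks[1:] + [bytes(parity)]:
    xor_into(corrupt, block)
corrupted_detected = not zero_detect(corrupt)
```

For a mirrored (RAID 1) pair the same check would apply with two blocks, since identical copies XOR to zero.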

The PCI local bus 102 is also connected to one or
more disk controllers 112 which are further connected to
a plurality of disk drives 114. Each of the plurality of disk
controllers 112 is preferably configured for a small computer
systems interface (SCSI) type interface and operates
as a PCI bus master. As shown in Figure 1, the local
processor 106 may, through the processor to PCI bridge
110, access the DRAM array 116 via the DRAM interface
118 or the disk drives 114 via the disk controllers 112.
Similarly, a host processor can, through the EISA bus
100 and through the EISA bridge 104, access the PCI
local bus 102 to communicate with the processor to PCI
bridge 110, the DRAM interface 118, or the disk controllers
112 to acquire the necessary data.

During operation, the host processor 90 sets up one 
or more command descriptor blocks (CDBs) to point to 
a host command packet in the memory array 98. The 
host processor 90 writes the address of the CDB to the 
processor to PCI bridge 110 preferably using an ex- 
change operation, with the processor to PCI bridge 110 
storing the CDB in a command FIFO, which is preferably 
a locked exchange FIFO according to the present invention.
The processor to PCI bridge 110 then retrieves the
CDB from the memory array 98 into a command list 
FIFO in the processor to PCI bridge 110 and informs the 
local processor 106 that a command list is available for 
processing. The local processor 106 parses the com- 
mand list for commands. The local processor 106 then 
builds CDBs in the DRAM memory array 116 for each 
command. Next, the local processor 106 issues requests
or command pointers for the local CDBs to the
DRAM interface 118 as necessary to read or write data
to the memory array 98 or other host memory. The local 
processor 106 issues these command pointers for the 
local CDBs to a locked exchange FIFO according to the 
present invention. The DRAM interface 118 then per- 
forms the operations indicated in the local CDBs. 

Figure 2 shows in more detail the DRAM interface 
118. In the upper portion of Figure 2, a PCI bus master 
120 is connected to a bus master read FIFO 122 which 
buffers data to the bus master. The bus master 120 is
also connected to a command FIFO 134 which accepts
and parses commands for operation by the bus master.
The bus master 120 is further connected to a byte translate
block 128 which performs the necessary byte alignment
operations between the source and destination. A
bus master internal to external move controller 124 is
connected to the bus master read FIFO 122 and to a
second byte translate block 130. The bus master internal
to external move controller 124 handles operations
where data is transferred to host memory 98 from the
internal DRAM array 116. The bus master internal to external
move controller 124 is connected to the command
FIFO 134 to receive operational control. The outputs of
the byte translate block 130 are connected to a DRAM
resource arbiter 144 and a DRAM controller 146 to enable
the bus master 120 to directly access the DRAM
array 116.

The command FIFO block 134 is also connected to
control a bus master zero fill block 126, which in turn is
connected to a bus master write FIFO 138. The command
FIFO block 134 is also connected to control a bus
master zero check block 132 which is connected to the
DRAM resource arbiter 144 and the DRAM controller 
146. The zero fill block 126 supports the zero-fill com- 
mand, which is used to initialize an XOR buffer in mem- 
ory to zero values for RAID 4 and 5 operations. The zero 
check block 132 supports the zero detect operation, 
which is used in RAID 1, RAID 4, and RAID 5 operations
to check redundant disk data integrity. 

To support the memory move commands inside the
DRAM array 116, the command FIFO block 134 is connected
to a bus master internal to internal move controller
140. The internal to internal move controller 140 handles
transfers from one location in the DRAM array 116
to another location in the DRAM array 116. The command
FIFO block 134 also controls a bus master external
to internal move controller 136, which transfers
data from host memory 98 to the internal
DRAM array 116. The translate blocks provide byte
alignment. The byte translate block 128 is connected to
the bus master external to internal controller 136 as well
as the bus master write FIFO 138. The bus master write
FIFO 138 is connected to a byte and double word translate
block 142 as well as the bus master internal to internal
move controller 140. The internal to internal move
controller 140 is connected to the byte and double word
translate block 142, whose output is connected to the
DRAM controller 146. The bus master write FIFO 138
is connected to the DRAM controller 146 and the DRAM
resource arbiter 144 for buffering and translating the data
transfers between the bus master 120 and the DRAM
array 116. Thus the described circuits facilitate the
transfer of data between the DRAM array 116 and the
bus master 120.

The lower portion of Figure 2 shows in more detail 
a block diagram of a PCI bus slave. In Figure 2, a PCI 
bus slave 168 is connected to the command FIFO block
134, a least recently used (LRU) hit coherency block 
164, a plurality of PCI configuration registers 166 and a 
PCI bus slave write FIFO 162. The PCI bus slave write 
FIFO 162 is a speed matching FIFO that allows for the 
posting of writes by the bus slave 168. 

The PCI configuration registers 166 are registers for
storing the configuration of the DRAM interface 118. 
These registers contain information such as vendor 
identification, device identification, command, status, 
revision identification, class code, cache line size, I/O 
register map base address, memory register map base 
address, DRAM memory base address, DRAM config- 
uration register, and refresh counter initialization set- 
tings, among others. 

The LRU hit coherency block 164 provides a local
script fetching mechanism which effectively provides a
read ahead coherent cache to minimize the wait time on
the disk controllers 112 when fetching instructions or data
from the DRAM array 116. The LRU hit coherency
block 164 is connected to a plurality of bus slave read
FIFOs 152-160. Each of the read FIFOs 152-160 is in
turn connected to the DRAM resource arbiter 144 and
the DRAM controller 146. Upon a read hit, data from the
read FIFOs can be immediately provided to the bus
slave 168 to improve system throughput. In the event of
a read miss, the FIFO buffer follows an adaptive re- 
placement policy, preferably the least recently used al- 
gorithm, to ensure optimal performance in multi-thread- 
ed applications. To ensure coherency of the data stored 
in the read FIFOs, all memory accesses are locked to 
the DRAM controller 146 through the PCI bus slave 168. 
Thus, as long as a PCI bus master is connected to the 
PCI bus slave 168, all writes to the DRAM 116 will be 
blocked to ensure coherency of information associated 
with the read FIFOs during the slave transfer. Further, 
any time the bus slave 168 is inactive, the LRU block 
164 snoops writes to the DRAM controller 146 to determine
if invalidation cycles to the read FIFOs 152-160
are needed. 

The refresh counter 150 provides various refresh 
cycles to the DRAM array 116, including CAS BEFORE 
RAS (CBR) refresh cycles. The CBR refresh cycles are
stacked two-deep such that a preemption of an on-going
access occurs only when that cycle is at least two refresh
periods long. The refresh counter block 150 is also
connected to the DRAM resource arbiter 144 to ensure
that the refresh cycles to the DRAM array 116 are not
untimely delayed.

The DRAM resource arbiter 144 controls all re- 
quests to access the DRAM. The resource arbiter 144 
provides the highest priority to requests from the CBR 
refresh counter block 150, followed by requests from the
bus slave write FIFO 162, followed by requests from the
read FIFO banks 152-160, and finally requests from the
bus master command FIFO 134.
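
The fixed priority order just described can be expressed as a simple selection function. This is an illustrative sketch only; the requester labels and the function name are hypothetical, and the actual arbiter 144 is a hardware circuit whose timing is not modeled here.

```python
# Requesters listed from highest to lowest priority, following the text.
# The string labels are illustrative names, not signal names from the design.
PRIORITY_ORDER = [
    "cbr_refresh_counter_150",
    "bus_slave_write_fifo_162",
    "read_fifo_banks_152_160",
    "bus_master_command_fifo_134",
]


def grant(pending_requests):
    """Return the highest priority requester that is pending, or None."""
    for requester in PRIORITY_ORDER:
        if requester in pending_requests:
            return requester
    return None


winner = grant({"bus_master_command_fifo_134", "read_fifo_banks_152_160"})
refresh_wins = grant(set(PRIORITY_ORDER))
idle = grant(set())
```

Placing refresh first reflects the stated goal that refresh cycles to the DRAM array are not untimely delayed.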

The CP FIFO register is located in the command
pointer FIFO 134 and may be accessed by a bus master
via the PCI bus slave 168, which connects with the command
FIFO block 134 to provide a communication channel
between the bus master and the controller. The CP
FIFO register has a read access mode in which the number
of remaining data words that can be written into the FIFO is
provided. It also has a write access mode where the address
of the next CDB or command pointer can be inserted
into the FIFO 134. The value read from the command
pointer FIFO register indicates the number of
command pointers that can be accepted by the controller:
a value of zero from the CP FIFO register indicates
that the CP FIFO 134 is full and that the CP FIFO 134
will not accept another command pointer, while a non-zero
value from the CP FIFO register indicates that the
bus master can submit that many command pointers
consecutively without having to read the FIFO 134 to
verify the availability of space in the FIFO 134. Consequently,
the CP FIFO register should be read before
submitting the first command and before each time the
number of consecutive CDB submissions equals the
value last read from the CP FIFO register.
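
The consecutive submission rule above amounts to treating the value read from the CP FIFO register as a credit count. The sketch below assumes a single producer; `read_cp_fifo` and `write_cp_fifo` are hypothetical accessors for the register, and the four-entry depth is chosen only to keep the example short.

```python
def submit_batch(pointers, read_cp_fifo, write_cp_fifo):
    """Submit command pointers, re-reading the register only when credit runs out."""
    sent = 0
    status_reads = 0
    while sent < len(pointers):
        credit = read_cp_fifo()       # number of pointers the FIFO can accept
        status_reads += 1
        if credit == 0:               # FIFO full: stop and requeue the rest
            break
        for pointer in pointers[sent:sent + credit]:
            write_cp_fifo(pointer)    # no status read needed between these writes
            sent += 1
    return sent, status_reads


# Hypothetical stand-in for the CP FIFO register, four entries deep.
DEPTH = 4
stored = []


def read_cp_fifo():
    return DEPTH - len(stored)


def write_cp_fifo(pointer):
    stored.append(pointer)


sent, status_reads = submit_batch(
    [0x10, 0x20, 0x30, 0x40, 0x50, 0x60], read_cp_fifo, write_cp_fifo
)
```

Four pointers are accepted after one status read; the second read returns zero, so the remaining two pointers stay with the bus master for a later attempt.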

In addition, when the CP FIFO register is read, the 
FIFO remaining count indication is latched and subse- 
quent writes to the CP FIFO register will not update the 
memory of the CP FIFO 134 unless the previous read 
of the CP FIFO register indicated that the CP FIFO 134
is not full and can accept another command pointer.

A second FIFO called a command completion pointer
(CCP) FIFO provides a channel for the bus master to
receive notifications of a completed command list from
the intelligent disk array system. The CCP FIFO can
preferably hold up to sixteen double words, each of
which is the address of an individual command list that
has been completed by the controller. When read, the
CCP FIFO register will either return the address of a
completed command list or a value of zero. A value of
zero indicates that none of the requested commands 
has been completed at the time of the status read. When 
a non-zero value is read from this register, the value re- 
turned is the address of a completed command list. 
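
A bus master can therefore drain the CCP FIFO by reading until a zero is returned. The following sketch illustrates that convention; the class, its `post` method for the controller side, and the `drain` helper are hypothetical names introduced for the example.

```python
from collections import deque


class CompletionFifo:
    """Illustrative model of the CCP FIFO register read behavior."""

    def __init__(self, depth=16):
        self.depth = depth
        self.entries = deque()

    def post(self, command_list_address):
        """Controller side: record a completed command list, if space allows."""
        if len(self.entries) < self.depth:
            self.entries.append(command_list_address)
            return True
        return False

    def read(self):
        """Bus master side: a completed address, or zero if none is pending."""
        return self.entries.popleft() if self.entries else 0


def drain(fifo):
    """Read completions until the register returns zero."""
    completed = []
    while (address := fifo.read()) != 0:
        completed.append(address)
    return completed


ccp = CompletionFifo()
ccp.post(0x2000)
ccp.post(0x2400)
completed = drain(ccp)
nothing_pending = ccp.read()
```

Because zero is reserved to mean "nothing completed", this convention assumes that no command list resides at address zero.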

Turning to Figure 3, the command pointer FIFO 134
of Figure 2 is shown in more detail. In Figure 3, a bus
master 120, or the producer, is connected to one side
of the FIFO 134, while a consumer 121 is connected to
the other side of the FIFO 134. The bus master 120 produces
data by writing to the FIFO 134 while the consumer
121 consumes data by reading from the FIFO 134.
The consumer 121 reads data from the FIFO 134 by asserting
a read strobe signal which is received by the read
input of a memory 200. The bus master 120 is also connected
to the write data bus of the memory 200, to a
write strobe generator 204, and to a CP status register
209. Preferably, the memory 200 is a 16 position, 32 bit
wide memory which acts as the FIFO data storage. On
the other side of the FIFO 134, the consumer 121 is connected
to the read data bus of the memory 200 and a
read signal of the read address generator 206. The read
address generator 206 drives the read address input to
the memory 200 to deliver the next data in the FIFO 134
to the consumer 121. The read address generator 206
is also connected to a difference counter 208 whose output
is provided to the CP status register 209 to provide
the remaining FIFO space value to the bus master 120
upon a read of the CP FIFO status register 209. Additionally,
the difference counter 208 is also connected to
the write strobe generator 204. Thus, the difference
counter 208 monitors consumer read operations as well
as producer write operations to detect the full condition
in the FIFO 134. Thus, the CP FIFO status register 209
is the read portion of the previously described CP FIFO
register. The output of the difference counter 208 and
the write signal from the bus master 120 are provided
to the write strobe generator 204 to generate the write
strobe for the memory 200. The strobe generator 204
detects and latches the FIFO full condition and generates
a write strobe to the memory 200 when the bus
master 120 writes to the FIFO 134 and the FIFO 134
was not full during the last query of the CP status register
209. Thus, the write operation to the memory 200 is the
write portion of the CP FIFO register previously described.
When a write to the memory 200 occurs, the
strobe generator 204 also causes the write address generator
202 to increment and point to the next available
memory location to be written. Also, each write to the
address generator 202 increases the output from the difference
counter 208 by one until the utilized space count
output preferably reaches sixteen, upon which the FIFO
134 is full.

On the other side of the memory 200, a consumer 
interface is provided so that the command pointer FIFO 
134 can be read by a consumer 121. A read address 
generator 206 increments the read address every time 
the consumer 121 reads from the FIFO 134. Each read 
from the FIFO 134 removes a data element from the 
memory 200 and thus decreases the difference counter 
208 output to reduce the utilized space count by one 
until the count reaches zero, upon which the FIFO 134 
is empty. Thus, the difference counter 208 computes the 
difference between the number of writes and the 
number of reads. The output of the difference counter 
208 is subtracted from the maximum stack depth value, 
preferably sixteen, to generate an output to the CP sta- 
tus register 209 indicating the remaining FIFO 134 
memory locations. The difference counter 208 output is 
also latched and gated with the next write signal to pre- 
vent data from being written to the memory 200 in the 
event the FIFO 134 is full during the last CP FIFO reg- 
ister read access. 
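
The counting behavior described above can be summarized in a small up/down counter model. This is a sketch only: the class and attribute names are illustrative, and clamping the count at zero and at the maximum depth is an assumption made so that the model cannot leave the stated range.

```python
class DifferenceCounter:
    """Illustrative up/down counter tracking writes minus reads."""

    def __init__(self, depth=16):
        self.depth = depth   # maximum stack depth, preferably sixteen
        self.count = 0       # utilized space count

    def on_write(self):
        """A write strobe to the memory increases the count by one."""
        if self.count < self.depth:
            self.count += 1

    def on_read(self):
        """A consumer read decreases the count by one."""
        if self.count > 0:
            self.count -= 1

    @property
    def remaining(self):
        """Value presented to the status register: depth minus count."""
        return self.depth - self.count

    @property
    def full(self):
        return self.count == self.depth

    @property
    def empty(self):
        return self.count == 0


counter = DifferenceCounter()
for _ in range(16):
    counter.on_write()
full_after_sixteen = counter.full
remaining_when_full = counter.remaining
counter.on_read()
remaining_after_read = counter.remaining
```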

Figures 4 and 5 disclose in more detail the write ad- 
dress generator 202 and the read address generator 
206 of Figure 3. For each push operation, or a FIFO 
write, the write address generator 202 increments a 
write address pointer to address the next location in the 
memory 200. Similarly, for each pop operation, or a 
FIFO read operation, the read address generator 206 
increments the read address pointer to address the next 
space in the memory 200. 

Turning to Figure 4, a circuit for generating the write 
address signals is disclosed. In Figure 4, the current 
write address value is stored in a plurality of flip-flops 
232-238. The set input of the flip-flops 232-238 is tied 
to a logic high, while the reset input of the flip-flops 
232-238 is collectively connected to the RESET* signal 
to clear the flip-flops 232-238 upon reset. Further, a
clock signal CLK is connected to the clock input of the 
flip-flops 232-238. Upon power-up, the write address is 
cleared to zero by the RESET* signal. The outputs of 
the flip-flops 232-238 are presented to an address in- 
crementer 220, which is essentially an adder with one 
input preset to one and the other input connected to the 
current write address output of the flip-flops 232-238. 
The output of the address incrementer 220 is provided 
to a multiplexer 230. The multiplexer 230 is selected using
the write strobe signal WR_FIFO from the strobe
generator 204.

During a write cycle, the output of the address incrementer
220 is coupled to the input of the write address
flip-flops 232-238. During all other times, the current
write address information is cycled back to the inputs
of the flip-flops 232-238.

The address incrementer 220 takes the address
output and adds one to it to increment the address pointer.
The output of the address incrementer 220 is provided
to the other input of the multiplexer 230. The multiplexer
230 is gated by the write strobe signal so that,
upon a write, the incremented address is provided to the
inputs of flip-flops 232-238 to be latched, or alternatively,
to provide the current write address when operations
other than writes are directed at the FIFO 134.

Turning to Figure 5, the circuit for generating the
read address bus is disclosed. In Figure 5, an address
incrementer 240 uses an adder with one input preconfigured
to one and the other input connected to the current
read address. The output of the address incrementer
240 is provided to a multiplexer 242. During a read
cycle, the output of the address incrementer 240 is coupled
to the input of read address flip-flops 244-250. The
set input of the flip-flops 244-250 is tied to a logic high,
while the reset input of the flip-flops 244-250 is collectively
connected to the RESET* signal. A clock signal
CLK is connected to the clock input of the flip-flops
244-250. Upon power-up, the read address is cleared
to zero by the reset signal. The multiplexer 242 is gated
by the read enable signal RD_FIFO so that, upon a read,
the incremented address is provided to the flip-flops
244-250 to be latched in, or alternatively when read operations
are not being performed, the current read address
value is looped back.
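
The incrementer, multiplexer and flip-flop arrangement shared by Figures 4 and 5 can be sketched as a next-state function. This is a hedged illustration: the function name `next_address` is hypothetical, and the four-bit width with wraparound at sixteen is an assumption inferred from the four flip-flops and the preferred depth of sixteen, not a statement from the description.

```python
ADDR_BITS = 4  # assumption: four flip-flops addressing sixteen locations


def next_address(current, strobe):
    """Return the address latched on the next clock edge."""
    # Adder with one input fixed at one; the result wraps at 2**ADDR_BITS.
    incremented = (current + 1) % (1 << ADDR_BITS)
    # Multiplexer: take the incremented value on a strobe, else recirculate.
    return incremented if strobe else current
```

The same function models either generator: `strobe` stands for WR_FIFO in Figure 4 and RD_FIFO in Figure 5.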

Figure 6 shows in more detail the difference counter
208 of Figure 3. In Figure 6, a differential count generator
254 tracks the differences between the number of
reads and writes to the FIFO 134 and thus keeps count
of the usage of the FIFO 134. The differential count generator
254 is essentially an adder/subtractor with one
input prewired to one and the other input connected to
the differential count output of flip-flops 258-266. The
differential count generator 254 has an add input and a
subtract input. The add input is connected to WR_FIFO,
the FIFO write strobe signal, and the subtract input is
connected to RD_FIFO, the FIFO read signal, so that
every write strobe assertion increments the difference
count while every read operation decrements the difference
count. RD_FIFO and WR_FIFO are OR'd by an
OR gate 252. The output of gate 252 is provided to the
select input of a multiplexer 256. The output of the differential
count generator 254 is also connected to one
input of the multiplexer 256, while the other input of the
multiplexer 256 is connected to the outputs of the flip-flops
258-266. The output of the multiplexer 256 is provided
to the inputs of flip-flops 258-266. The set inputs of
flip-flops 258-266 are connected collectively to a logic
high, while the reset inputs are collectively connected to
the RESET* so that upon reset, the difference count is
zero. Finally, all the clock inputs of flip-flops 258-266 are
connected to CLK to be clocked.
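
The difference counter of Figure 6 can likewise be sketched as a next-state function. The function name `next_diff_count` is hypothetical. The description does not state what the adder/subtractor produces when both strobes are asserted in the same clock; treating that case as no net change is an assumption of this sketch.

```python
def next_diff_count(current, wr_fifo, rd_fifo):
    """Return the difference count latched on the next clock edge."""
    # OR of the two strobes selects the multiplexer input; with neither
    # strobe asserted the flip-flop outputs are recirculated.
    if not (wr_fifo or rd_fifo):
        return current
    # Assumption: a simultaneous write and read leaves the count unchanged.
    if wr_fifo and rd_fifo:
        return current
    # Adder/subtractor with one input prewired to one.
    return current + 1 if wr_fifo else current - 1
```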

Figure 7 discloses the circuits to compute the space
available in the FIFO 134 as well as to detect the FIFO
134 empty and FIFO 134 full conditions. In Figure 7, the
DIFF_CNT value is provided from the flip-flops 258-266
to comparators 268 and 270 and to a subtractor 272.
The comparator 268 compares the DIFF_CNT value
with the value of zero. In the event of a match, the comparator
268 asserts a FIFO_EMPTY signal to indicate
the DIFF_CNT value is equal to zero. Similarly, the comparator
270 compares DIFF_CNT value with the value
of sixteen, the maximum FIFO depth in the preferred
embodiment. In the event of a match, the comparator
270 asserts a FIFO_FULL signal to indicate that the
FIFO 134 cannot accept any more data. The DIFF_CNT
value is also provided to a subtractor 272 which subtracts
the difference value from the maximum FIFO
depth of sixteen in the preferred embodiment. The output
from the subtractor 272 is provided to the status register
209 whose output is enabled onto the PCI local bus
whenever the FIFO status register 209 is read. The status
register 209 is enabled when a READ signal, indicating
a PCI read operation, and a
COMMAND_POINTER_FIFO_ADDRESS_DECODE signal, a
decoded signal indicating that the bus master has selected
the CP FIFO register, are asserted to a NAND gate 274.
The output of the NAND gate 274 is provided to the
low-going enable input of the buffer 276 to drive the value
of the remaining FIFO space count onto the PCI local
bus 102 for the bus master to read.
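
The comparators, the subtractor and the active-low buffer enable of Figure 7 reduce to a few expressions. The following is a minimal sketch with hypothetical function names; the depth of sixteen is the preferred value from the description.

```python
def fifo_status(diff_cnt, depth=16):
    """Decode the difference count into the three Figure 7 outputs."""
    return {
        "FIFO_EMPTY": diff_cnt == 0,      # comparator against zero
        "FIFO_FULL": diff_cnt == depth,   # comparator against maximum depth
        "remaining": depth - diff_cnt,    # subtractor feeding the status register
    }


def status_enable_n(read, address_decode):
    """NAND of the two inputs; a low (False) result enables the bus buffer."""
    return not (read and address_decode)
```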

Turning to Figure 8, the circuit for providing the write
blocking signals to the memory 200 during a locked exchange
cycle is disclosed. In Figure 8, LATCHED_IOR and
LATCHED_MRD_REG signals are provided to an OR gate
278 to decode an I/O or memory read operation. The output
of the OR gate 278 is connected to the input of an AND
gate 280, whose other input is connected to
COMMAND_POINTER_FIFO_ADDRESS_DECODE, a decoded
signal indicating that a bus master is selecting the CP FIFO
134. The output of the AND gate 280 is provided to the
select input of a multiplexer 282. One input of the multiplexer
282 is connected to the FIFO_FULL signal of Figure
7 while the other input is connected to the output of a
flip-flop 286, which is the
RELATIVE_TO_WHAT_WAS_READ_FIFO_FULL signal. Thus,
when a read is directed to the CP FIFO register, FIFO_FULL
is gated to an AND gate 284. Otherwise,
RELATIVE_TO_WHAT_WAS_READ_FIFO_FULL is looped back.
The output from the multiplexer 282 is provided to one
input of the AND gate 284, while the other input of the AND
gate is connected to LOAD_REG_READ_DATA, a signal from
a PCI bus state machine indicating a data read operation is
occurring. Finally, the output of the AND gate 284 is
connected to the input of the flip-flop 286. The reset signal
RESET* is connected to the set input so that, after reset,
the FIFO 134 is set up to be full. The flip-flop 286 is
clocked by the CLK signal.

The circuit for blocking the write strobe signal will now
be discussed. In Figure 8, READ_MEM_REG and
READ_IOR_REG signals are provided to an OR gate 290
to decode a register read. The output of the gate 290 is
provided to one input of an AND gate 292, while the other
input of the AND gate 292 is connected to the
COMMAND_POINTER_FIFO_ADDRESS_DECODE signal.
The output of the AND gate 292 is connected to the
select input of a multiplexer 294. One input of the multiplexer
294 is the RELATIVE_TO_WHAT_WAS_READ_FIFO_FULL
signal, while the other input is connected to
BLOCK_WRITE, the output of a flip-flop 296. The output of
the multiplexer 294 is provided to the input of the flip-flop
296, while the output of the flip-flop 296 is provided to the
input of the multiplexer 294 as the block write signal.

The reset inputs of flip-flops 286 and 296 are tied to
a logic high signal, while the set inputs of flip-flops 286
and 296 are tied to the RESET* signal to clear both
flip-flops upon reset. The flip-flop 296 is clocked by the CLK
signal.

The output of the flip-flop 296 is connected to the
input of an inverter 298, whose output is provided to one
input of an AND gate 302. The
COMMAND_POINTER_FIFO_ADDRESS_DECODE signal is
provided as an input to the AND gate 302. The WRITE_IO_REG
signal and the WRITE_MEM_REG signals are connected to
an OR gate 300 whose output is connected to one input
of the AND gate 302 to indicate that a write operation is
directed to the CP FIFO register. Finally,
PCI_BS_REQUESTING_BE, a decoded PCI bus request enable
signal, is also provided to one input of the AND gate 302
to further decode the write operation. The AND gate 302
generates the WR_FIFO signal. Thus, when the FIFO
134 was full at the time the CP FIFO register was read,
the WRITE_STROBE signal is disabled until the next
time the CP FIFO register is read and the FIFO 134 is
not full. Thus, the FIFO 134 cannot be written into during
the locked exchange operation when the FIFO 134 is
full.
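
The net effect of the Figure 8 logic can be illustrated with a behavioral sketch. It abstracts away the individual gates and clock timing, and the class and method names are hypothetical. The initial blocked state follows the statement that, after reset, the FIFO is set up to be full; that reading is an assumption of this sketch. The flag changes only on a status read, so a producer is assumed to pair every write with a preceding read.

```python
class LockedExchangeFifo:
    """Behavioral sketch of write blocking (hypothetical names)."""

    def __init__(self, depth=16):
        self.depth = depth
        self.items = []
        # Assumption: the blocking flag starts set after reset.
        self.block_write = True

    def read_status(self):
        # A read of the CP FIFO register latches the full condition.
        remaining = self.depth - len(self.items)
        self.block_write = (remaining == 0)
        return remaining

    def write(self, value):
        # The write strobe is passed only if the last status read
        # did not find the FIFO full.
        if self.block_write:
            return False
        self.items.append(value)
        return True

    def pop(self):
        # Consumer side; does not touch the latched flag.
        return self.items.pop(0)
```

A write that follows a status read of zero is dropped, and the next non-zero status read re-enables writes.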

Figure 9 illustrates the use of the exchange instruction
in conjunction with the locked exchange FIFO 134.
This process involves an exchange XCHG instruction at
step 310 which incorporates a read operation of the CP
FIFO register in step 310 immediately followed by a
write operation to the CP FIFO register, with the read
and write operations locked due to the nature of the exchange
operation. The bus master exchange instruction
will be to exchange the command pointer value with the
value in the FIFO 134. The use of the exchange operation
is particularly efficient because the bus master can
poll the CP FIFO register and write the command pointer
in one step. The locked exchange circuitry of the FIFO
134 ignores any write operation that follows a read of a
zero value from the CP FIFO register. When the locked
exchange is complete, the bus master examines the data
that was read in step 312 and determines if the FIFO
134 was not full, as indicated by a non-zero value. A
value of zero indicates that a submission was not successful
and must be tried again. A value of non-zero indicates
that there was room in the FIFO 134 and that
the submission was successful.
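
From the bus master's side, the submission procedure of Figure 9 amounts to a retry loop around the exchange. The following sketch is illustrative only: `StubCpFifo`, `xchg`, `submit` and the `max_tries` bound are hypothetical names and parameters, and the stub stands in for the CP FIFO register rather than modeling real bus traffic.

```python
class StubCpFifo:
    """Hypothetical stand-in for the CP FIFO register."""

    def __init__(self, depth=16):
        self.depth = depth
        self.queue = []

    def xchg(self, command_pointer):
        # Read half of the exchange: remaining space count.
        remaining = self.depth - len(self.queue)
        # Write half: ignored when the read returned zero.
        if remaining != 0:
            self.queue.append(command_pointer)
        return remaining


def submit(fifo, command_pointer, max_tries=3):
    """Retry the exchange until a non-zero value is read back."""
    for _ in range(max_tries):
        if fifo.xchg(command_pointer) != 0:
            return True   # non-zero read: the submission was accepted
    return False          # still full after max_tries attempts
```

A caller would requeue the command pointer when `submit` returns `False`; the retry bound is a design choice of the sketch, not something the description specifies.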

Thus the multiple steps of the prior art to check the
full state of the FIFO and then write data if not full are
replaced by a single exchange operation, with the
write-inhibiting nature until a non-full read occurs ensuring
that data is not improperly written. By then allowing the
use of an exchange operation, multiple bus masters
need not use a semaphore to share the FIFO because
of the locked nature of the exchange operation.

The foregoing disclosure and description of the invention
are illustrative and explanatory thereof, and various
changes in the size, shape, materials, components,
circuit elements, wiring connections and contacts, as
well as in the details of the illustrated circuitry and
construction and method of operation may be made without
departing from the spirit of the invention.



Claims 

1. A data buffer for sequentially storing data from a
producer and providing said data to a consumer,
said producer generating a write strobe signal, said
consumer generating a read strobe signal, said data
buffer comprising:

a memory having a memory write strobe input
for receiving a memory write strobe signal and
a read strobe input for receiving said read
strobe signal from said consumer, said memory
storing data upon receipt of said memory write
strobe signal, said memory providing data to
said consumer upon receipt of said read strobe
signal, said memory having a memory capacity;
a memory full detector having a read input and
a write input respectively coupled to said read
strobe signal from said consumer and said write
strobe signal from said producer and having a
full output, said memory full detector asserting
said full output when the difference between the
number of said read and write strobe signal
assertions equals said memory capacity; and
a write enable generator having a write strobe
input for receiving said write strobe signal from
said producer, a full input coupled to said full
output, and a write strobe output providing a
write strobe signal connected to said memory
write strobe input, said write enable generator
blocking the write strobe signal from said producer
to said memory when said full output is
asserted and passing the write strobe signal
from said producer to said memory when said
full output is negated.

2. The data buffer of claim 1, wherein said memory full
detector further comprises:

a difference counter having a read input and a
write input respectively coupled to said read
and write strobe signals, said difference counter
having difference outputs for storing a difference
value, said difference counter decrementing
the difference outputs on each assertion
of said read strobe signal and incrementing
the difference outputs on each assertion of said
write strobe signal; and
a comparator having an input coupled to said
difference outputs and having an output coupled
to said full output, said comparator asserting
said full output when said difference value
equals said memory capacity.

3. The data buffer of claim 2, wherein said difference 
counter includes an adder/subtractor having an in- 
put preset to one. 

4. The data buffer of claim 2, wherein said difference 
counter includes a subtractor having first inputs 
coupled to said memory capacity and second inputs 
coupled to said difference outputs, said subtractor 
having remaining space outputs for providing a re- 
maining space signal, said subtractor subtracting 
said difference outputs from said memory capacity 
to generate said remaining space signal. 

5. The data buffer of claim 4, wherein said producer
generates a read strobe signal, the data buffer further
comprising:

a status register having a producer read input
strobe for receiving said read strobe signal from
said producer and status outputs for presenting a
status signal, said status register presenting said
status signal to said producer upon receipt of said
producer read input strobe.

6. The data buffer of claim 5, wherein said status signal
reflects the remaining space in said data buffer.

7. The data buffer of claim 5, wherein said memory
and said status register are responsive to an address
value and wherein said address values for
write operations to said memory and for read operations
from said status register are the same.

8. The data buffer of claim 5, wherein said write enable
generator blocks said write strobe signal from said
producer to said memory when said full output is
asserted after said status register has presented
said status signal to said producer.

9. The data buffer of claim 5, wherein said write enable
generator passes said write strobe signal from said
producer to said memory when said full output is
negated after said status register has presented
said status signal to said producer.

10. The data buffer of claim 1, wherein said memory has
write address inputs for receiving a write address signal,
the data buffer further comprising a write address
generator having a write strobe input for receiving
said write strobe signal and having write address
outputs coupled to said write address inputs
for providing a write address signal to said memory,
said write address generator incrementing the value
of said write address signal upon receipt of said
memory write strobe signal.

11. The data buffer of claim 1, wherein said memory has
read address inputs for receiving a read address
signal, the data buffer further comprising a read address
generator having a read strobe input for receiving
said read strobe signal and having read address
outputs coupled to said read address inputs
for providing a read address signal to said memory,
said read address generator incrementing the value
of said read address signal upon receipt of said read
strobe signal.

12. A computer system comprising:

a producer generating a write strobe signal
when providing data;
a consumer generating a read strobe signal
when requesting data; and
a data buffer according to any of claims 1 to 11
connected to said producer and said consumer.

13. A disk controller comprising:

a producer generating a write strobe signal
when providing data;
a consumer generating a read strobe signal
when requesting data; and
a data buffer according to any of claims 1 to 11
connected to said producer and said consumer.

14. A method for sequentially storing data from a producer
into a memory, said memory having a write
strobe input for receiving a write strobe signal, said
memory storing data upon receipt of said write
strobe signal, said method comprising the steps of:

receiving a status read from said producer to
said memory;
determining an available space count associated
with said memory and providing said available
space count to said producer in response
to said status read;
receiving write data and a write strobe signal
from said producer; and
blocking said write strobe signal from said
memory when said available space count is zero
and otherwise passing the write strobe signal
from said producer to said memory to store said
write data.

15. The method of claim 14, wherein said determining
step further comprises the step of latching said
available space count.

16. The method of claim 14, wherein said receiving a
status read step and said receiving write data and
a write strobe signal step are responsive to an address
value and wherein said address values for
said receiving steps are the same.

17. The method of claim 14, wherein said memory has
a read strobe input for receiving a read strobe signal,
the method further comprising the step of providing
said data from said memory to a consumer
upon receipt of a read strobe signal from said consumer.

18. The method of claim 17, wherein said memory has
a memory capacity and wherein said determining
step further comprises the steps of:

counting a difference between said read strobe
signal assertion and said write strobe signal
assertion;
comparing said difference with said memory
capacity; and
latching the result of said comparing step.

19. The method of claim 18, wherein said determining
step further comprises the step of subtracting said
difference from said memory capacity to generate
said available space count.

20. The method of claim 14, wherein said memory has
write address inputs for receiving a write address
signal, the method further comprising the step of
incrementing said write address signal upon receipt
of said write strobe signal.

21. The method of claim 14, wherein said memory has
read address inputs for receiving a read address
signal, the method further comprising the step of
incrementing said read address signal upon receipt
of said read strobe signal.

22. A method for sequentially storing data from a producer
into a locked exchange memory, said method
comprising the steps of:

reading an available space count from said
locked exchange memory;
immediately writing said data to said locked
exchange memory; and
determining whether said locked exchange
memory has accepted said data, wherein said
reading and immediately writing steps are responsive
to an address value and wherein said
address values for said reading and immediately
writing steps are the same.

23. The method of claim 22, wherein said determining
step further comprises the step of comparing said
available space count with zero.